Description

The course is designed for students who want to start a career in Big Data and for professionals who want to move into Big Data from another technology. Google Cloud is used throughout the course, and candidates also receive help with interview questions and resume preparation. The course covers MapReduce, HDFS, Sqoop, Hive, Spark, Unix, and Scala.

This is an instructor-led course with an average batch size of 5 students. Over 60 hours of live online training, you will gain both the theoretical and practical knowledge needed to build the necessary skills. The institute’s holistic approach is designed to meet students’ long-term needs: it provides 100% job/placement assistance, and a trial class can be taken before enrolment.

What Will I Learn?

  • Big Data Introduction, the Four Vs of Big Data, MapReduce and HDFS
  • Unix Concepts, Basic Unix Commands and how to write a shell script
  • Scala Basics: Variables, Strings and Numbers

Specifications

  • Free Demo
  • Learn from Experts
  • Interactive Learning
  • Instalment Facility

Big Data Introduction

  • Big Data Introduction
  • What is Big Data and why Big Data?
  • Four Vs of Big Data
  • Scaling problems with existing systems and how Hadoop resolves them
  • What are MapReduce and HDFS
  • Different Hadoop vendors in the industry


Unix

  • Unix Concepts
  • Introduction to Unix
  • Basic Unix Commands
  • How to write a shell script


HDFS & its Architecture

  • Distributed Computing – NameNode and DataNode concepts
  • HDFS Introduction and Architecture
  • What are blocks in HDFS and how do they make Hadoop fault tolerant
  • What is the Secondary NameNode
  • What is checkpointing in Hadoop 1.0
  • Differences between Hadoop 1.0 and Hadoop 2.0
  • HDFS configuration files and how to change the block size on a cluster
  • Hadoop File System Commands (see the sketch after this list)
  • Assignment on HDFS
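
The shell commands covered in this module map closely onto Hadoop's programmatic API. Below is a minimal sketch of common HDFS operations from Scala using the Hadoop FileSystem API; the paths and names are illustrative assumptions, and each call is annotated with the hdfs dfs command it mirrors:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object HdfsBasics {
      def main(args: Array[String]): Unit = {
        // Picks up fs.defaultFS from the cluster's core-site.xml on the classpath
        val fs = FileSystem.get(new Configuration())

        val dir = new Path("/user/demo/input")              // hypothetical path
        fs.mkdirs(dir)                                      // hdfs dfs -mkdir -p
        fs.copyFromLocalFile(new Path("data.txt"), dir)     // hdfs dfs -put
        fs.listStatus(dir).foreach(s => println(s.getPath)) // hdfs dfs -ls
        fs.delete(dir, true)                                // hdfs dfs -rm -r
      }
    }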


MapReduce and Its Architecture

  • Different phases of MapReduce and the execution flow
  • What is an Input Split in MR
  • The Word Count problem in MR (see the sketch after this list)
  • The joining problem in MR
  • How to develop and submit MR code on a Hadoop cluster
  • Assignment on MapReduce
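
As a companion to the Word Count topic above, here is a compact sketch of the classic two-phase job against the Hadoop MapReduce API, written in Scala (courses often present it in Java; the class names here are illustrative assumptions):

    import java.lang
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
    import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

    // Map phase: emit (word, 1) for every token in the input split
    class TokenMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
      private val one  = new IntWritable(1)
      private val word = new Text()
      override def map(key: LongWritable, value: Text,
                       ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit =
        value.toString.split("\\s+").filter(_.nonEmpty).foreach { w =>
          word.set(w); ctx.write(word, one)
        }
    }

    // Reduce phase: sum the counts that the shuffle grouped under each word
    class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
      override def reduce(key: Text, values: lang.Iterable[IntWritable],
                          ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
        var sum = 0
        values.forEach(v => sum += v.get)
        ctx.write(key, new IntWritable(sum))
      }
    }

    object WordCount {
      def main(args: Array[String]): Unit = {
        val job = Job.getInstance(new Configuration(), "word count")
        job.setJarByClass(classOf[TokenMapper])
        job.setMapperClass(classOf[TokenMapper])
        job.setReducerClass(classOf[SumReducer])
        job.setOutputKeyClass(classOf[Text])
        job.setOutputValueClass(classOf[IntWritable])
        FileInputFormat.addInputPath(job, new Path(args(0)))
        FileOutputFormat.setOutputPath(job, new Path(args(1)))
        System.exit(if (job.waitForCompletion(true)) 0 else 1)
      }
    }

Packaged as a JAR, a job like this is submitted to the cluster with hadoop jar, e.g. hadoop jar wordcount.jar WordCount <input> <output>.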


Yarn

  • Why Yarn
  • Components of Yarn and its architecture
  • How the ResourceManager functions
  • NodeManager responsibilities
  • How the ApplicationMaster works
  • Different Schedulers in Yarn


Sqoop

  • What is Sqoop and why is it used
  • Import Data from RDBMS to HDFS
  • Full vs Incremental Data Import
  • Different File formats to Import Data
  • Various methods to Import Data
  • Performance Tuning
  • Sqoop Jobs
  • Automate Sqoop using Shell Script
  • Sqoop Export from Hadoop to RDBMS


Hive / Impala

  • Hive Introduction
  • Datatypes in Hive
  • Architecture of Hive
  • How to create databases and tables using different file formats (see the sketch after this list)
  • Different ways to load data into Hive tables
  • Views in Hive
  • External vs Internal Tables
  • Partitioning vs bucketing
  • Static vs Dynamic Partitioning
  • Joins In Hive
  • Map-side joins in Hive
  • Analytical functions in Hive
  • Performance tuning
  • Hive shell vs Beeline Shell
  • Hive Execution Modes – MapReduce, Tez/Spark
  • What is Impala and how is it different from Hive
  • Assignment
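
The table topics above are exercised in Hive's own shells (Hive CLI or Beeline); as a bridge to the Spark module, the same DDL can also be issued from Scala through Spark's Hive integration. A hedged sketch, assuming a configured Hive metastore and using made-up table names:

    import org.apache.spark.sql.SparkSession

    object HiveDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-demo")
          .enableHiveSupport()   // requires a reachable Hive metastore
          .getOrCreate()

        // Managed (internal), partitioned table stored as Parquet
        spark.sql("""CREATE TABLE IF NOT EXISTS sales (id INT, amount DOUBLE)
                     PARTITIONED BY (country STRING) STORED AS PARQUET""")

        // Dynamic partitioning: the partition value comes from the data itself
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        spark.sql("""INSERT INTO sales PARTITION (country)
                     VALUES (1, 9.99, 'IN'), (2, 4.50, 'US')""")

        spark.sql("SELECT country, SUM(amount) FROM sales GROUP BY country").show()
        spark.stop()
      }
    }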


Scala

  • Scala Basics
  • Variables, Strings and Numbers
  • Arrays, Lists, Tuples and Maps
  • For loops, if-else and match expressions
  • Functions, Objects and Classes
  • What is a case class in Scala
  • The Scala REPL
  • How to write and run a Scala program in an IDE (see the sketch after this list)
  • Assignment
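
A compact sketch touching each of the basics listed above (immutable vals vs mutable vars, the core collections, a guarded for loop, and pattern matching on a case class); paste it into the Scala REPL or run it from an IDE:

    object ScalaBasics {
      // Case classes get equals, hashCode, toString and pattern-matching support for free
      case class Point(x: Int, y: Int)

      def main(args: Array[String]): Unit = {
        val name: String = "big data"  // immutable value
        var count = 0                  // mutable variable; type Int is inferred
        count += 1

        val nums     = List(1, 2, 3)
        val pair     = ("hadoop", 2)                  // a tuple
        val versions = Map("spark" -> 3, "hive" -> 3) // a map

        for (n <- nums if n % 2 == 1) println(n)      // for loop with a guard

        val quadrant = Point(1, 2) match {            // match on a case class
          case Point(0, 0)          => "origin"
          case Point(x, _) if x > 0 => "right half"
          case _                    => "elsewhere"
        }
        println(s"$name $count $pair ${versions("spark")} $quadrant")
      }
    }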


Spark

  • Introduction to Spark
  • What are RDDs?
  • How to Create RDDs
  • Transformations in RDD
  • Actions in RDD
  • Lazy evaluation in Spark
  • Lineage Graph in RDD
  • What are pair RDDs and when are they used
  • What are DataFrames in Spark and how are they different from RDDs
  • How to create DataFrames (see the sketch after this list)
  • How to load data from an RDBMS into Hadoop using Spark
  • How to perform transformations using the DataFrame API
  • What is a broadcast join in Spark
  • Cache vs Persist in Spark
  • Performance tuning in Spark
  • What are Datasets in Spark and how are they different from the DataFrame API
  • Assignment on Spark
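
To tie the RDD and DataFrame topics together, here is a minimal local-mode sketch in Scala (the data, names and commented-out JDBC settings are illustrative assumptions):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object SparkDemo {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("spark-demo")
          .master("local[*]")   // local mode for experimentation
          .getOrCreate()
        import spark.implicits._

        // RDDs: transformations are lazy; nothing runs until an action
        val lines  = spark.sparkContext.parallelize(Seq("big data", "big deal"))
        val counts = lines.flatMap(_.split(" "))  // transformation
          .map(w => (w, 1))                       // makes this a pair RDD
          .reduceByKey(_ + _)                     // still lazy
        counts.collect().foreach(println)         // action: triggers the whole lineage

        // DataFrames: schema-aware and optimized by Catalyst
        val df = Seq(("alice", 34), ("bob", 29)).toDF("name", "age")
        df.cache()                                // cache() = persist() at the default level
        df.filter($"age" > 30).select(upper($"name")).show()

        // Loading from an RDBMS would look like this (hypothetical connection details):
        // spark.read.format("jdbc").option("url", "jdbc:mysql://host/db")
        //   .option("dbtable", "sales").option("user", "u").option("password", "p").load()

        spark.stop()
      }
    }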

Lakesh Kumar

An IT professional with 7 years of experience, including 5 years of extensive experience in Hadoop Big Data technologies. Well versed in MapReduce, HDFS, Sqoop, Hive, Spark, Unix, and Scala.

Fee: ₹25,000

